Poster: fmonk Date: Nov 19, 2011 8:13am
Forum: web Subject: Allow IA to index, prevent search engines

I run a poetry site (self-hosted, WordPress) on which I track the evolution of my poems through a number of drafts. To prevent non-current drafts from showing up in search results I use meta robots noindex, follow.

The problem with that, I just realized, is the IA will probably skip those drafts, which I want archived—the main goals of the site are: to show how poetry evolves, to show potential writers that you will burn through several drafts, and to have a record of how poems mutate.

Is there any way I can (easily, in WordPress) have the IA index everything while excluding old drafts from search results? Or is there a way I can give the IA permission to ignore the robots meta tag and snake up everything (except for admin pages)?

I have about 150 drafts I want excluded from search engines but open to the IA, so ideally the solution won't require me to manually change _every_ draft.


Thanks,

ASIP.

Poster: Nemo_bis (Administrator, Curator, or Staff) Date: Nov 3, 2013 3:23am
Forum: web Subject: Re: Allow IA to index, prevent search engines

On a standard website, the easiest approach would be to put the pages you want archived in a separate directory, use robots.txt instead of meta tags, and whitelist the Wayback Machine for that directory (though I don't know whether the machine accepts Allow:). The user agent to target is ia_archiver, according to https://archive.org/about/exclude.php
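
As a rough sketch of that idea, the snippet below holds a candidate robots.txt and checks it locally with Python's standard urllib.robotparser. The /drafts/ directory and the sample URLs are assumptions made up for illustration; /wp-admin/ stands in for the admin pages mentioned in the question. It sidesteps the Allow: question by giving ia_archiver its own group that simply does not disallow the drafts. It only shows how a standards-following parser reads the rules, not how the Wayback Machine or any particular search engine actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Candidate robots.txt. The /drafts/ path is a hypothetical layout,
# not something taken from the original site.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /wp-admin/

User-agent: *
Disallow: /drafts/
Disallow: /wp-admin/
"""


def can_fetch(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Sample agents and paths, chosen only for illustration.
    for agent in ("ia_archiver", "Googlebot"):
        for path in ("/drafts/poem-v1/", "/poems/poem-final/", "/wp-admin/"):
            verdict = "allowed" if can_fetch(agent, path) else "blocked"
            print(f"{agent:12} {path:22} {verdict}")
```

Since every old draft would sit under one directory, a rule like this covers all of them at once rather than needing a change per draft. Keep in mind that robots.txt controls crawling, so it is not an exact substitute for a noindex meta tag.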